AITopics | denoising diffusion probabilistic model

Collaborating Authors

denoising diffusion probabilistic model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Star-Shaped Denoising Diffusion Probabilistic Models

Neural Information Processing SystemsDec-24-2025, 04:21:49 GMT

Requests for name changes in the electronic proceedings will be accepted with no questions asked. However name changes may cause bibliographic tracking issues. Authors are asked to consider this carefully and discuss it with their co-authors prior to requesting a name change in the electronic proceedings. Use the Report an Issue link to request a name change.

denoising diffusion probabilistic model, name change, star-shaped denoising diffusion probabilistic model, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Add feedback

Denoising Diffusion Probabilistic Models

Neural Information Processing SystemsDec-24-2025, 00:47:16 GMT

We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding. On the unconditional CIFAR10 dataset, we obtain an Inception score of 9.46 and a state-of-the-art FID score of 3.17. On 256x256 LSUN, we obtain sample quality similar to ProgressiveGAN.

denoising diffusion probabilistic model, electronic proceedings, name change, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Image2Gcode: Image-to-G-code Generation for Additive Manufacturing Using Diffusion-Transformer Model

Wang, Ziyue, Jadhav, Yayati, Pak, Peter, Farimani, Amir Barati

arXiv.org Artificial IntelligenceDec-10-2025

Mechanical design and manufacturing workflows conventionally begin with conceptual design, followed by the creation of a computer-aided design (CAD) model and fabrication through material-extrusion (MEX) printing. This process requires converting CAD geometry into machine-readable G-code through slicing and path planning. While each step is well established, dependence on CAD modeling remains a major bottleneck: constructing object-specific 3D geometry is slow and poorly suited to rapid prototyping. Even minor design variations typically necessitate manual updates in CAD software, making iteration time-consuming and difficult to scale. To address this limitation, we introduce Image2Gcode, an end-to-end data-driven framework that bypasses the CAD stage and generates printer-ready G-code directly from images and part drawings. Instead of relying on an explicit 3D model, a hand-drawn or captured 2D image serves as the sole input. The framework first extracts slice-wise structural cues from the image and then employs a denoising diffusion probabilistic model (DDPM) over G-code sequences. Through iterative denoising, the model transforms Gaussian noise into executable print-move trajectories with corresponding extrusion parameters, establishing a direct mapping from visual input to native toolpaths. By producing structured G-code directly from 2D imagery, Image2Gcode eliminates the need for CAD or STL intermediates, lowering the entry barrier for additive manufacturing and accelerating the design-to-fabrication cycle. This approach supports on-demand prototyping from simple sketches or visual references and integrates with upstream 2D-to-3D reconstruction modules to enable an automated pipeline from concept to physical artifact. The result is a flexible, computationally efficient framework that advances accessibility in design iteration, repair workflows, and distributed manufacturing.

artificial intelligence, machine learning, toolpath, (17 more...)

arXiv.org Artificial Intelligence

2511.20636

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Machinery > Industrial Machinery (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

MammoRGB: Dual-View Mammogram Synthesis Using Denoising Diffusion Probabilistic Models

Garza-Abdala, Jorge Alberto, Fumagal-González, Gerardo A., Avendano, Daly, Cardona, Servando, Hussain, Sadam, de Avila-Armenta, Eduardo, Toscano-Martínez, Jasiel H., Gurmendi, Diana S. M. Rosales, Pedro-Pérez, Alma A., Tamez-Pena, Jose Gerardo

arXiv.org Artificial IntelligenceDec-1-2025

Purpose: This study aims to develop and evaluate a three channel denoising diffusion probabilistic model (DDPM) for synthesizing single breast dual view mammograms and to assess the impact of channel representations on image fidelity and cross view consistency. Materials and Methods: A pretrained three channel DDPM, sourced from Hugging Face, was fine tuned on a private dataset of 11020 screening mammograms to generate paired craniocaudal (CC) and mediolateral oblique (MLO) views. Three third channel encodings of the CC and MLO views were evaluated: sum, absolute difference, and zero channel. Each model produced 500 synthetic image pairs. Quantitative assessment involved breast mask segmentation using Intersection over Union (IoU) and Dice Similarity Coefficient (DSC), with distributional comparisons against 2500 real pairs using Earth Movers Distance (EMD) and Kolmogorov Smirnov (KS) tests. Qualitative evaluation included a visual Turing test by a non expert radiologist to assess cross view consistency and artifacts. Results: Synthetic mammograms showed IoU and DSC distributions comparable to real images, with EMD and KS values (0.020 and 0.077 respectively). Models using sum or absolute difference encodings outperformed others in IoU and DSC (p < 0.001), though distributions remained broadly similar. Generated CC and MLO views maintained cross view consistency, with 6 to 8 percent of synthetic images exhibiting artifacts consistent with those in the training data. Conclusion: Three channel DDPMs can generate realistic and anatomically consistent dual view mammograms with promising applications in dataset augmentation.

artificial intelligence, machine learning, mammogram, (18 more...)

arXiv.org Artificial Intelligence

2511.22759

Country:

North America (0.14)
South America > Chile (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.89)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Nuclear Medicine (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Efficiency vs. Fidelity: A Comparative Analysis of Diffusion Probabilistic Models and Flow Matching on Low-Resource Hardware

Gupta, Srishti, Taiwade, Yashasvee

arXiv.org Artificial IntelligenceNov-25-2025

Denoising Diffusion Probabilistic Models (DDPMs) have established a new state-of-the-art in generative image synthesis, yet their deployment is hindered by significant computational overhead during inference, often requiring up to 1,000 iterative steps. This study presents a rigorous comparative analysis of DDPMs against the emerging Flow Matching (Rectified Flow) paradigm, specifically isolating their geometric and efficiency properties on low-resource hardware. By implementing both frameworks on a shared Time-Conditioned U-Net backbone using the MNIST dataset, we demonstrate that Flow Matching significantly outperforms Diffusion in efficiency. Our geometric analysis reveals that Flow Matching learns a highly rectified transport path (Curvature $\mathcal{C} \approx 1.02$), which is near-optimal, whereas Diffusion trajectories remain stochastic and tortuous ($\mathcal{C} \approx 3.45$). Furthermore, we establish an ``efficiency frontier'' at $N=10$ function evaluations, where Flow Matching retains high fidelity while Diffusion collapses. Finally, we show via numerical sensitivity analysis that the learned vector field is sufficiently linear to render high-order ODE solvers (Runge-Kutta 4) unnecessary, validating the use of lightweight Euler solvers for edge deployment. \textbf{This work concludes that Flow Matching is the superior algorithmic choice for real-time, resource-constrained generative tasks.}

artificial intelligence, flow matching, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2511.19379

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.62)

Add feedback

Expert Validation of Synthetic Cervical Spine Radiographs Generated with a Denoising Diffusion Probabilistic Model

Barr, Austin A., Karmur, Brij S., Winder, Anthony J., Guo, Eddie, Lysack, John T., Scott, James N., Morrish, William F., Eesa, Muneer, Willson, Morgan, Cadotte, David W., Yang, Michael M. H., Chan, Ian Y. M., Lama, Sanju, Sutherland, Garnette R.

arXiv.org Artificial IntelligenceOct-28-2025

Machine learning in neurosurgery is limited by challenges in assembling large, high-quality imaging datasets. Synthetic data offers a scalable, privacy-preserving solution. We evaluated the feasibility of generating realistic lateral cervical spine radiographs using a denoising diffusion probabilistic model (DDPM) trained on 4,963 images from the Cervical Spine X-ray Atlas. Model performance was monitored via training/validation loss and Frechet inception distance, and synthetic image quality was assessed in a blinded "clinical Turing test" with six neuroradiologists and two spine-fellowship trained neurosurgeons. Experts reviewed 50 quartets containing one real and three synthetic images, identifying the real image and rating realism on a 4-point Likert scale. Experts correctly identified the real image in 29% of trials (Fleiss' kappa=0.061). Mean realism scores were comparable between real (3.323) and synthetic images (3.228, 3.258, and 3.320; p=0.383, 0.471, 1.000). Nearest-neighbor analysis found no evidence of memorization. We also provide a dataset of 20,063 synthetic radiographs. These results demonstrate that DDPM-generated cervical spine X-rays are statistically indistinguishable in realism and quality from real clinical images, offering a novel approach to creating large-scale neuroimaging datasets for ML applications in landmarking, segmentation, and classification.

artificial intelligence, machine learning, synthetic image, (17 more...)

arXiv.org Artificial Intelligence

2510.22166

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.15)

Genre: Research Report > Experimental Study > Negative Result (0.34)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.70)

Add feedback

InJecteD: Analyzing Trajectories and Drift Dynamics in Denoising Diffusion Probabilistic Models for 2D Point Cloud Generation

Jain, Sanyam, Naveed, Khuram, Oleksiienko, Illia, Iosifidis, Alexandros, Pauwels, Ruben

arXiv.org Artificial IntelligenceSep-17-2025

This work introduces InJecteD, a framework for interpreting Denoising Diffusion Probabilistic Models (DDPMs) by analyzing sample trajectories during the denoising process of 2D point cloud generation. We apply this framework to three datasets from the Datasaurus Dozen bullseye, dino, and circle using a simplified DDPM architecture with customizable input and time embeddings. Our approach quantifies trajectory properties, including displacement, velocity, clustering, and drift field dynamics, using statistical metrics such as Wasserstein distance and cosine similarity. By enhancing model transparency, InJecteD supports human AI collaboration by enabling practitioners to debug and refine generative models. Experiments reveal distinct denoising phases: initial noise exploration, rapid shape formation, and final refinement, with dataset-specific behaviors example, bullseyes concentric convergence vs. dinos complex contour formation. We evaluate four model configurations, varying embeddings and noise schedules, demonstrating that Fourier based embeddings improve trajectory stability and reconstruction quality

artificial intelligence, dataset, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2509.12239

Country: Europe > Finland (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.61)

Add feedback

WaveLLDM: Design and Development of a Lightweight Latent Diffusion Model for Speech Enhancement and Restoration

Santoso, Kevin Putra, Sholikah, Rizka Wakhidatus, Ginardi, Raden Venantius Hari

arXiv.org Artificial IntelligenceSep-1-2025

High-quality audio is essential in a wide range of applications, including online communication, virtual assistants, and the multimedia industry. However, degradation caused by noise, compression, and transmission artifacts remains a major challenge. While diffusion models have proven effective for audio restoration, they typically require significant computational resources and struggle to handle longer missing segments. This study introduces WaveLLDM (Wave Lightweight Latent Diffusion Model), an architecture that integrates an efficient neural audio codec with latent diffusion for audio restoration and denoising. Unlike conventional approaches that operate in the time or spectral domain, WaveLLDM processes audio in a compressed latent space, reducing computational complexity while preserving reconstruction quality. Empirical evaluations on the Voicebank+DEMAND test set demonstrate that WaveLLDM achieves accurate spectral reconstruction with low Log-Spectral Distance (LSD) scores (0.48 to 0.60) and good adaptability to unseen data. However, it still underperforms compared to state-of-the-art methods in terms of perceptual quality and speech clarity, with WB-PESQ scores ranging from 1.62 to 1.71 and STOI scores between 0.76 and 0.78. These limitations are attributed to suboptimal architectural tuning, the absence of fine-tuning, and insufficient training duration. Nevertheless, the flexible architecture that combines a neural audio codec and latent diffusion model provides a strong foundation for future development.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.21153

Genre: Research Report (1.00)

Industry: Health & Medicine (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Fast 3D Diffusion for Scalable Granular Media Synthesis

Hassan, Muhammad Moeeze, Cottereau, Régis, Gatti, Filippo, Dec, Patryk

arXiv.org Artificial IntelligenceAug-28-2025

Simulating granular media, using Discrete Element Method is a computationally intensive task. This is especially true during initialization phase, which dominates total simulation time because of large displacements involved and associated kinetic energy. We overcome this bottleneck with a novel generative pipeline based on 3D diffusion models that directly synthesizes arbitrarily large granular assemblies in their final and physically realistic configurations. The approach frames the problem as a 3D generative modeling task, consisting of a two-stage pipeline. First a diffusion model is trained to generate independent 3D voxel grids representing granular media. Second, a 3D inpainting model, adapted from 2D inpainting techniques using masked inputs, stitches these grids together seamlessly, enabling synthesis of large samples with physically realistic structure. The inpainting model explores several masking strategies for the inputs to the underlying UNets by training the network to infer missing portions of voxel grids from a concatenation of noised tensors, masks, and masked tensors as input channels. The model also adapts a 2D repainting technique of re-injecting noise scheduler output with ground truth to provide a strong guidance to the 3D model. This along with weighted losses ensures long-term coherence over generation of masked regions. Both models are trained on the same binarized 3D occupancy grids extracted from small-scale DEM simulations, achieving linear scaling of computational time with respect to sample size. Quantitatively, a 1.2 m long ballasted rail track synthesis equivalent to a 3-hour DEM simulation, was completed under 20 seconds. The generated voxel grids can also be post-processed to extract grain geometries for DEM-compatibility as well, enabling physically coherent, real-time, scalable granular media synthesis for industrial applications.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.19752

Country: Europe > France (0.15)

Genre: Research Report (0.42)

Industry:

Transportation > Ground > Rail (0.51)
Transportation > Infrastructure & Services (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Diffusing the Blind Spot: Uterine MRI Synthesis with Diffusion Models

Müller, Johanna P., Knupfer, Anika, Blöss, Pedro, Vittur, Edoardo Berardi, Kainz, Bernhard, Hutter, Jana

arXiv.org Artificial IntelligenceAug-26-2025

Despite significant progress in generative modelling, existing diffusion models often struggle to produce anatomically precise female pelvic images, limiting their application in gynaecological imaging, where data scarcity and patient privacy concerns are critical. To overcome these barriers, we introduce a novel diffusion-based framework for uterine MRI synthesis, integrating both unconditional and conditioned Denoising Diffusion Probabilistic Models (DDPMs) and Latent Diffusion Models (LDMs) in 2D and 3D. Our approach generates anatomically coherent, high fidelity synthetic images that closely mimic real scans and provide valuable resources for training robust diagnostic models. We evaluate generative quality using advanced perceptual and distributional metrics, benchmarking against standard reconstruction methods, and demonstrate substantial gains in diagnostic accuracy on a key classification task. A blinded expert evaluation further validates the clinical realism of our synthetic images. We release our models with privacy safeguards and a comprehensive synthetic uterine MRI dataset to support reproducible research and advance equitable AI in gynaecology.

artificial intelligence, machine learning, synthesis, (15 more...)

arXiv.org Artificial Intelligence

2508.07903

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback